Abstract. Using methods introduced by Scargle we derive a cumulative
version of the Lomb periodogram that exhibits frequency-independent
statistics when applied to cumulative noise. We show how this
cumulative Lomb periodogram allows us to estimate the significance of
log-periodic signatures in the S&P 500 anti-bubble that started in
August 2000.



1. Introduction 

Speculative bubbles, crashes and depressions are some of the most
puzzling phenomena in financial markets. Even though these are rare
events, they occur much more often than expected from standard models of
financial markets. For example, the otherwise very successful GARCH model
predicts only one crash of a magnitude comparable to the events of 1929
and 1987 in 16000 years [JS98]. This suggests that the very largest
crashes are outliers that are caused by phenomena not considered in the
standard models.

Several authors [DS96], [JAF96], [VBMA98] have argued that this behavior
is natural if one considers the stock market as a complex system. When
this system is far away from critical points, standard models are a good
description of the system's dynamics, but when a critical point is
approached the dynamics change completely. In this picture a crash occurs
when a critical point is reached.

The theory of critical phenomena predicts that all observables of a
complex system near a critical point are scale invariant. In the case of
stock markets this would mean that the price $p(t)$ near a critical point
should follow a power law

$$\log p(t) \approx A + B(t_c - t)^{\beta},$$

where $t_c$ is the time of the crash. Unfortunately it is rather difficult
to distinguish this from the exponential growth

$$\log p(t) \approx A + Bt$$

predicted by standard theory.

Help in this situation comes from the concept of discrete scale
invariance. If the system considered has a hierarchical structure it
might scale only in accordance with these hierarchies. This allows the
scaling coefficient $\beta$ to be a complex number and the observables
will obey

$$\log p(t) \approx \mathrm{Re}\left(A + B(t_c - t)^{\beta}\right)
  = A + B(t_c - t)^{\alpha} + C(t_c - t)^{\alpha}\cos(\omega\ln(t_c - t) + \phi).$$
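To see where the cosine term comes from, it may help to write the complex
exponent as $\alpha + i\omega$ and to allow a complex amplitude. The
following short derivation is only a sketch under that assumption; the
amplitude $|B'|e^{i\phi}$ is notation introduced here for illustration and
does not appear in the text.

```latex
\begin{align*}
(t_c - t)^{\alpha + i\omega}
  &= (t_c - t)^{\alpha}\, e^{i\omega \ln(t_c - t)}, \\
\mathrm{Re}\left( |B'|\, e^{i\phi}\, (t_c - t)^{\alpha + i\omega} \right)
  &= |B'|\, (t_c - t)^{\alpha} \cos\bigl(\omega \ln(t_c - t) + \phi\bigr).
\end{align*}
```

Identifying $C = |B'|$ and adding a real power-law term with the same
exponent $\alpha$ gives an expression of the form stated above.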

Such a power law with log-periodic signature is much easier to detect than
a pure power law. In fact sizable log-periodic signatures have been
detected before many financial crashes, including the ones of 1929 and
1987 ([UEEll], [JAF96], [VBMA98], [SLJ99], [HJ00]). They have also been
found after the bursting of some speculative bubbles, for example gold
after 1980, the Nikkei index after 1990 and most recently after the
bursting of the new economy bubble in 2000 ([lJ99], [SZ02]).

Still there is some reason for doubt. As several authors have pointed out
([HJL+00], [Fei01b]), log-periodic signatures can often arise by chance in
systems that do not exhibit discrete scale invariance. Even simple
Brownian motion produces log-periodic signatures quite often. Therefore
there has been a heated discussion about the significance of the
log-periodic signatures found in financial data [Fei01b], [SJ01],
[Fei01a].

One tool to detect these signatures has been the Lomb periodogram
introduced by Lomb [Lom76] and improved by Scargle in [Sca82]. An
important property of Scargle's periodogram is that the individual Lomb
powers of independent, normally distributed noise follow approximately an
exponential distribution.
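The exponential behaviour can be checked numerically. The following is a
minimal sketch, not the implementation used for this paper: the function
name `scargle_power`, the uniformly random sampling times, the test
frequency and the known noise variance `sigma2 = 1` are all assumptions
chosen for illustration.

```python
import numpy as np

def scargle_power(t, x, omega, sigma2):
    """Scargle-normalised Lomb power of samples x at times t (one frequency)."""
    # Time offset tau that decorrelates the cosine and sine sums.
    tau = np.arctan2(np.sum(np.sin(2 * omega * t)),
                     np.sum(np.cos(2 * omega * t))) / (2 * omega)
    c = np.cos(omega * (t - tau))
    s = np.sin(omega * (t - tau))
    return 0.5 * (np.dot(x, c) ** 2 / np.sum(c ** 2)
                  + np.dot(x, s) ** 2 / np.sum(s ** 2)) / sigma2

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, size=200))  # unevenly spaced times
omega = 2 * np.pi * 0.37                        # arbitrary test frequency
powers = np.array([scargle_power(t, rng.normal(size=t.size), omega, 1.0)
                   for _ in range(4000)])
for z in (1.0, 2.0, 3.0):
    # Empirical tail probability next to the exponential prediction exp(-z).
    print(z, np.mean(powers > z), np.exp(-z))
```

The two printed columns should agree up to Monte Carlo error.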

Unfortunately this property is lost when one calculates Scargle's
periodogram for cumulative noise, where the differences between two
observations are independent and normally distributed. In this Brownian
motion case, the expected Lomb powers are much greater for small
frequencies than for large ones. Therefore the significance of large Lomb
powers at small frequencies is difficult to estimate. Huang, Saleur,
Sornette and Zhou tackle this problem and several related ones with
extensive Monte Carlo simulations in [HJL+00] and [ZS02].



In this paper we present an analytic approach to the problem of
estimating the significance of log-periodic signatures. We start by
introducing a small correction to Scargle's Lomb periodogram that makes
the distribution of Lomb powers exactly exponential for independent,
normally distributed noise.

In the second part we use the same methods to derive a normalisation of
the Lomb periodogram that ensures a frequency-independent exponential
distribution of Lomb powers for cumulative noise.

In the last section we apply these new methods to estimate the
significance of log-periodic signatures in the so-called S&P 500
anti-bubble after the crash of 2000. We show how our methods greatly
simplify the whole analysis and derive that there is about a 6% chance
that a signature like the one detected by Sornette and Zhou in [SZ02]
arises by chance if one only considers frequencies smaller than 10.0. If
one searches all frequencies up to the Nyquist frequency, peaks of this
height become much more common.

Furthermore we detect equally significant peaks at harmonics of the
fundamental frequency of Sornette and Zhou. This complements evidence for
a more sophisticated modelling of the S&P 500 anti-bubble by Sornette and
Zhou in [ZS03].

We would like to thank Volker Mdller for hosting our database of financial
data and donating CPU time for the Monte Carlo simulations that led to
this paper.



2. Independent Noise 
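Consider the classical periodogram

$$P_X(\omega) = \frac{1}{N_0}\left[\left(\sum_{j=1}^{N_0} X_j\cos\omega t_j\right)^2
  + \left(\sum_{j=1}^{N_0} X_j\sin\omega t_j\right)^2\right].$$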
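As a small worked example, the classical periodogram can be evaluated
directly from its definition. This is a sketch only; the function name
`classical_periodogram` and the evenly sampled cosine test signal are
assumptions made for illustration.

```python
import numpy as np

def classical_periodogram(t, x, omega):
    """Classical periodogram: squared cosine and sine sums divided by N_0."""
    n0 = len(x)
    c = np.sum(x * np.cos(omega * t))
    s = np.sum(x * np.sin(omega * t))
    return (c ** 2 + s ** 2) / n0

# Evenly spaced test signal: a pure cosine with 8 full cycles in 64 samples.
t = np.arange(64, dtype=float)
omega0 = 2 * np.pi * 8 / 64
x = np.cos(omega0 * t)

at_signal = classical_periodogram(t, x, omega0)              # close to N_0 / 4
off_signal = classical_periodogram(t, x, 2 * np.pi * 3 / 64)  # close to zero
print(at_signal, off_signal)
```

For this signal the cosine sum equals $N_0/2$ and the sine sum vanishes, so
the power at the signal frequency is $N_0/4 = 16$.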



This is a generalization of the Fourier transform to the case of unevenly
spaced measurements. Unfortunately this classical form has difficult
statistical behavior for uneven spacing. Scargle therefore proposed in
[Sca82] a normalized form of the periodogram that restores good
statistical properties in the case where all $X_j$ are independent and
normally distributed with mean zero and variance $\sigma_0^2$. For this he
observes that

$$C(\omega) = \sum_{j=1}^{N_0} X_j\cos\omega t_j$$

and

$$S(\omega) = \sum_{j=1}^{N_0} X_j\sin\omega t_j$$

are again normally distributed, with variances

$$\sigma_c^2 = N_0\sigma_0^2\sum_{j=1}^{N_0}\cos^2\omega t_j$$

and

$$\sigma_s^2 = N_0\sigma_0^2\sum_{j=1}^{N_0}\sin^2\omega t_j.$$



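He also claims that for normally distributed random variables with unit
variance the sum of squares is exponentially distributed. Therefore he
proposes to look at

$$P(\omega) = \frac{1}{2}\left[
  \frac{\left(\sum_{j=1}^{N_0} X_j\cos\omega t_j\right)^2}{\sigma_c^2}
  + \frac{\left(\sum_{j=1}^{N_0} X_j\sin\omega t_j\right)^2}{\sigma_s^2}\right].$$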



Scargle's claim that this sum is exponentially distributed is only correct
when $S(\omega)$ and $C(\omega)$ are independent. This is approximately
true when the observation times $t_j$ are "not too badly bunched". In
other cases the whole variance/covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_c^2 & \sigma_{cs} \\ \sigma_{cs} & \sigma_s^2 \end{pmatrix}$$

with

$$\sigma_{cs} = N_0\sigma_0^2\sum_{j=1}^{N_0}\cos\omega t_j\sin\omega t_j$$

has to be considered.

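If $\Sigma$ is invertible the values of the quadratic form

$$P(\omega) = \frac{1}{2}\,\bigl(C(\omega), S(\omega)\bigr)\,\Sigma^{-1}\,
  \bigl(C(\omega), S(\omega)\bigr)^{t}$$

are exponentially distributed, i.e.

$$\mathrm{Probability}(P(\omega) > z) = \exp(-z),$$

by standard facts about multivariate normal distributions (for example
[Mui82, Theorem 1.4.1]). Explicitly we can calculate

$$\Sigma^{-1} = \frac{1}{\sigma_c^2\sigma_s^2 - \sigma_{cs}^2}
  \begin{pmatrix} \sigma_s^2 & -\sigma_{cs} \\ -\sigma_{cs} & \sigma_c^2 \end{pmatrix}.$$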

This means the natural form of the periodogram is

$$P(\omega) = \frac{1}{2}\,
  \frac{\sigma_s^2 C^2(\omega) + \sigma_c^2 S^2(\omega) - 2\sigma_{cs}C(\omega)S(\omega)}
       {\sigma_c^2\sigma_s^2 - \sigma_{cs}^2},$$

which reduces to Scargle's case for $C$ and $S$ independent
($\sigma_{cs} = 0$), and to the classical case for even spacing
($\sigma_c = \sigma_s$). This $P(\omega)$ is always exponentially
distributed.
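The explicit formula and the quadratic form can be compared numerically.
The sketch below makes several assumptions that are not taken from the
text: the function name `corrected_power`, the two-cluster sampling times,
and the convention that `sigma2` is the variance of a single observation,
so that any constant prefactor in the variances is absorbed into `sigma2`.

```python
import numpy as np

def corrected_power(t, x, omega, sigma2):
    """Quadratic-form periodogram using the full covariance of (C, S)."""
    cos_t, sin_t = np.cos(omega * t), np.sin(omega * t)
    C, S = np.dot(x, cos_t), np.dot(x, sin_t)
    var_c = sigma2 * np.sum(cos_t ** 2)       # variance of C
    var_s = sigma2 * np.sum(sin_t ** 2)       # variance of S
    cov_cs = sigma2 * np.sum(cos_t * sin_t)   # covariance of C and S
    det = var_c * var_s - cov_cs ** 2
    return 0.5 * (var_s * C ** 2 + var_c * S ** 2 - 2 * cov_cs * C * S) / det

rng = np.random.default_rng(2)
# Badly bunched sampling: two tight clusters of observation times.
t = np.concatenate([rng.uniform(0.0, 1.0, 30), rng.uniform(10.0, 10.5, 30)])
omega = 2 * np.pi * 0.2
x = rng.normal(size=t.size)

# Cross-check the explicit formula against 0.5 * v^T Sigma^{-1} v.
cos_t, sin_t = np.cos(omega * t), np.sin(omega * t)
v = np.array([np.dot(x, cos_t), np.dot(x, sin_t)])
Sigma = np.array([[np.sum(cos_t ** 2), np.sum(cos_t * sin_t)],
                  [np.sum(cos_t * sin_t), np.sum(sin_t ** 2)]])
quadratic_form = 0.5 * v @ np.linalg.solve(Sigma, v)
explicit = corrected_power(t, x, omega, 1.0)

# Monte Carlo tail probability, to be compared with exp(-2).
powers = np.array([corrected_power(t, rng.normal(size=t.size), omega, 1.0)
                   for _ in range(4000)])
tail = np.mean(powers > 2.0)
print(explicit, quadratic_form, tail, np.exp(-2.0))
```

The bunched sampling is chosen deliberately so that the covariance term is
far from zero in this example.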

Like Scargle, we also replace $t_j$ by $t_j - \tau$ with

$$\tau = \frac{1}{2\omega}\arctan
  \frac{\sum_{j=1}^{N_0}\sin 2\omega t_j}{\sum_{j=1}^{N_0}\cos 2\omega t_j}$$

in all formulas, to make the periodogram time-translation invariant.
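The effect of the offset can be illustrated with a few lines of code. This
is a sketch under assumptions: the helper name `time_offset` is
hypothetical, and `arctan2` is used in place of a plain arctangent of the
quotient so that a vanishing denominator causes no problem.

```python
import numpy as np

def time_offset(t, omega):
    """Offset tau computed from the sums of sin(2 omega t) and cos(2 omega t)."""
    return np.arctan2(np.sum(np.sin(2 * omega * t)),
                      np.sum(np.cos(2 * omega * t))) / (2 * omega)

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0.0, 50.0, size=120))
omega = 2 * np.pi * 0.8
tau = time_offset(t, omega)

# After the shift, the sum of cos * sin over the sample vanishes.
cross = np.sum(np.cos(omega * (t - tau)) * np.sin(omega * (t - tau)))

# Translating all times by 5 moves tau by 5 plus a multiple of pi / omega,
# so the shifted phases change at most by a common multiple of pi.
tau_shifted = time_offset(t + 5.0, omega)
half_periods = (tau_shifted - tau - 5.0) * omega / np.pi
print(cross, half_periods)
```

A common phase change by a multiple of $\pi$ flips the signs of the cosine
and sine sums together, which leaves their squares and their product
unchanged.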



3. Cumulative Noise 

Assume now that the differences $Y_j = X_j - X_{j-1}$ are independent and
normally distributed with zero mean and variance $\sigma_0^2$. Then the
distribution of powers in Scargle's periodogram depends on the angular
frequency $\omega = 2\pi f$. See Figure 1 for the result of a Monte Carlo
simulation of this case. At small frequencies much higher Lomb powers can
occur by chance than at high frequencies. This reflects the well-known
fact that the expected value at frequency $f$ of the evenly spaced Fourier
transform of a Brownian motion is proportional to $1/f^2$. Our idea is now
to normalize the periodogram by adjusting Scargle's methods to this case.
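The frequency dependence is easy to reproduce. The following sketch
assumes random walks with unit-variance increments observed at times
log(1), ..., log(500); the number of walks and the two test frequencies
0.5 and 8.0 are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n, walks = 500, 300
t = np.log(np.arange(1, n + 1))                      # log(1), ..., log(500)
x = np.cumsum(rng.normal(size=(walks, n)), axis=1)   # rows hold X_j - X_0

def mean_scargle_power(f):
    """Average Scargle power over all walks at frequency f (sigma_0 = 1)."""
    omega = 2 * np.pi * f
    tau = np.arctan2(np.sum(np.sin(2 * omega * t)),
                     np.sum(np.cos(2 * omega * t))) / (2 * omega)
    c, s = np.cos(omega * (t - tau)), np.sin(omega * (t - tau))
    power = 0.5 * ((x @ c) ** 2 / np.sum(c ** 2)
                   + (x @ s) ** 2 / np.sum(s ** 2))
    return power.mean()

low, high = mean_scargle_power(0.5), mean_scargle_power(8.0)
print(low, high)   # the low-frequency average is much larger
```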

First notice that

$$X_j - X_0 = \sum_{k=1}^{j} Y_k$$

in this situation. From this we obtain

$$\begin{aligned}
C(\omega) &= \sum_{j=1}^{n}(X_j - X_0)\cos\omega t_j \\
          &= \sum_{j=1}^{n}\sum_{k=1}^{j} Y_k\cos\omega t_j \\
          &= \sum_{k=1}^{n} Y_k\sum_{j=k}^{n}\cos\omega t_j
\end{aligned}$$
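The exchange of the two sums can be verified numerically. In this sketch
the reversed cumulative sum `tail` holds the inner sums over $j \ge k$; the
variable names and the test frequency are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
t = np.log(np.arange(1, n + 1))
omega = 2 * np.pi * 1.7

y = rng.normal(size=n)          # increments Y_1, ..., Y_n
x_minus_x0 = np.cumsum(y)       # X_j - X_0 for j = 1, ..., n
cos_t = np.cos(omega * t)

# Direct form: sum over j of (X_j - X_0) * cos(omega t_j).
direct = np.sum(x_minus_x0 * cos_t)

# Exchanged form: sum over k of Y_k times the tail sum over j >= k.
tail = np.cumsum(cos_t[::-1])[::-1]
exchanged = np.sum(y * tail)
print(direct, exchanged)
```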

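and a similar formula for $S(\omega)$. Notice that $C(\omega)$ and
$S(\omega)$ are still normally distributed. Since
$\langle Y_k, Y_l\rangle = \sigma_0^2$ for $k = l$ and $0$ otherwise, we
obtain

$$\begin{aligned}
\sigma_c^2 = \langle C^2(\omega)\rangle
  &= \sum_{k,l}\langle Y_k, Y_l\rangle
     \left(\sum_{j=k}^{n}\cos\omega t_j\right)\left(\sum_{j=l}^{n}\cos\omega t_j\right) \\
  &= \sigma_0^2\sum_{k=1}^{n}\left(\sum_{j=k}^{n}\cos\omega t_j\right)^2
\end{aligned}$$

and

$$\begin{aligned}
\sigma_s^2 = \langle S^2(\omega)\rangle
  &= \sum_{k,l}\langle Y_k, Y_l\rangle
     \left(\sum_{j=k}^{n}\sin\omega t_j\right)\left(\sum_{j=l}^{n}\sin\omega t_j\right) \\
  &= \sigma_0^2\sum_{k=1}^{n}\left(\sum_{j=k}^{n}\sin\omega t_j\right)^2
\end{aligned}$$

and

$$\begin{aligned}
\sigma_{cs} = \langle C(\omega)S(\omega)\rangle
  &= \sum_{k,l}\langle Y_k, Y_l\rangle
     \left(\sum_{j=k}^{n}\sin\omega t_j\right)\left(\sum_{j=l}^{n}\cos\omega t_j\right) \\
  &= \sigma_0^2\sum_{k=1}^{n}\left(\sum_{j=k}^{n}\cos\omega t_j\right)
     \left(\sum_{j=k}^{n}\sin\omega t_j\right).
\end{aligned}$$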



With these values

$$P(\omega) = \frac{1}{2}\,
  \frac{\sigma_s^2 C^2(\omega) + \sigma_c^2 S^2(\omega) - 2\sigma_{cs}C(\omega)S(\omega)}
       {\sigma_c^2\sigma_s^2 - \sigma_{cs}^2}$$

is again exponentially distributed:

$$\mathrm{Probability}(P(\omega) > z) = \exp(-z).$$

In particular the distribution is now independent of $\omega$, as
exemplified by Figure 2, which shows the cumulative Lomb periodogram for a
Monte Carlo simulation as above.
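A compact sketch of the cumulative Lomb periodogram, under stated
assumptions, is given below. The function name `cumulative_lomb_power` is
hypothetical, the increment variance is taken as known (`sigma2 = 1`), the
time offset $\tau$ is left out for brevity, and the simulation size is an
arbitrary choice.

```python
import numpy as np

def cumulative_lomb_power(t, x, omega, sigma2):
    """Cumulative Lomb power; x holds X_j - X_0 along the last axis."""
    cos_t, sin_t = np.cos(omega * t), np.sin(omega * t)
    C, S = x @ cos_t, x @ sin_t
    # Tail sums over j >= k: the weights multiplying the increment Y_k.
    wc = np.cumsum(cos_t[::-1])[::-1]
    ws = np.cumsum(sin_t[::-1])[::-1]
    var_c = sigma2 * np.sum(wc ** 2)
    var_s = sigma2 * np.sum(ws ** 2)
    cov_cs = sigma2 * np.sum(wc * ws)
    det = var_c * var_s - cov_cs ** 2
    return 0.5 * (var_s * C ** 2 + var_c * S ** 2 - 2 * cov_cs * C * S) / det

rng = np.random.default_rng(5)
n, walks = 500, 4000
t = np.log(np.arange(1, n + 1))                      # logarithmic times
x = np.cumsum(rng.normal(size=(walks, n)), axis=1)   # random walks

tails = {}
for f in (0.5, 1.7, 8.0):
    p = cumulative_lomb_power(t, x, 2 * np.pi * f, 1.0)
    tails[f] = np.mean(p > 1.0)
    print(f, tails[f], np.exp(-1.0))   # empirical tail vs. exp(-1)
```

The empirical tail probabilities should be close to $\exp(-1)$ at all three
frequencies, in contrast to the uncorrected periodogram.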

Notice that it is essential to consider the correlation between
$C(\omega)$ and $S(\omega)$ in this case. Figure 3 shows the cumulative
Lomb periodograms for 1000 random walks of length 500 in logarithmic time
without the correlation term above. Notice how this introduces spurious
peaks at several frequencies.
Again we replace $t_j$ by $t_j - \tau$ with

$$\tau = \frac{1}{2\omega}\arctan
  \frac{\sum_{j=1}^{n}\sin 2\omega t_j}{\sum_{j=1}^{n}\cos 2\omega t_j}$$

in all formulas, to make the periodogram time-translation invariant.



4. Application to log-periodicity in financial data 

Recently Sornette and Zhou have suggested that there is a log-periodic
signature in the S&P 500 index after the bursting of the new economy
bubble [SZ02]. Among other methods they use Scargle's periodogram with the
logarithmic times $\log(t_j - t_c)$ in place of $t_j$ and with $X_j$ the
logarithm of the index price $j$ days after a critical date $t_c$, for
frequencies $0 < f < 10$. To account for a nonlinear trend of the form

$$A + B(t_j - t_c)^{\alpha}$$

they first detrend the index values. For this they use a value of $A$
obtained from a nonlinear fit of

$$(*)\qquad A + B(t_j - t_c)^{\alpha} + C(t_j - t_c)^{\alpha}\cos(\omega\ln(t_j - t_c) + \phi)$$

to the logarithm of the index price and then determine $B$ and $\alpha$
for different choices of $t_c$ by linear regression.

We follow the same procedure, but estimate $A$ by a different method. For
this we consider the correlation coefficient

$$r_A = \frac{\sum_j (t_j - \bar{t})(\log(X_j - A) - \bar{X})}
  {\sqrt{\sum_j (t_j - \bar{t})^2}\,\sqrt{\sum_j (\log(X_j - A) - \bar{X})^2}}$$

with $\bar{t}$ the average of the $t_j$'s (here the logarithmic times) and
$\bar{X}$ the average of the $\log(X_j - A)$'s. This correlation
coefficient measures how well $\log(X_j - A)$ can be approximated by a
linear model

$$\log B + \alpha\log(t_j - t_c).$$

Notice that $r_A$ does not depend on $B$ and $\alpha$. Since correlation
coefficients near $+1$ and $-1$ indicate strong explanatory value of the
proposed trend, while coefficients near $0$ indicate that a trend with
this value of $A$ seems not to be present, we first search for the value
of $A$ that maximises $|r_A|$. Then, like Sornette and Zhou, we find $B$
and $\alpha$ by linear regression.
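The two-step estimate can be sketched as a grid search followed by a
linear fit. This is not the code used for the paper: the function name
`estimate_trend`, the grid of candidate values and the synthetic data are
assumptions, and the sketch requires $X_j - A > 0$ for every candidate $A$.

```python
import numpy as np

def estimate_trend(log_t, x, a_grid):
    """Pick A maximising |r_A|, then fit log B and alpha by linear regression."""
    best_a, best_abs_r = a_grid[0], -1.0
    for a in a_grid:
        # Correlation between logarithmic time and log(X_j - A).
        r = np.corrcoef(log_t, np.log(x - a))[0, 1]
        if abs(r) > best_abs_r:
            best_a, best_abs_r = a, abs(r)
    # Slope is alpha, intercept is log B.
    alpha, log_b = np.polyfit(log_t, np.log(x - best_a), 1)
    return best_a, np.exp(log_b), alpha

# Synthetic trend (illustration only): A = 2, B = 1.5, alpha = 0.5.
days = np.arange(1.0, 201.0)
log_t = np.log(days)
x = 2.0 + 1.5 * days ** 0.5
a_grid = np.linspace(0.0, 3.0, 301)   # all candidates lie below min(x)

a_hat, b_hat, alpha_hat = estimate_trend(log_t, x, a_grid)
print(a_hat, b_hat, alpha_hat)
```

On this noise-free example the search recovers the parameters used to
generate the data, up to the resolution of the grid.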

With this method we address a critique of Feigenbaum [Fei01a]. He points
out that the detection of log-periodicity by Lomb periodograms is not
independent of the detection via nonlinear fits of equation (*) to the
price data, if values produced by the second are reused in the first. Our
approach eliminates this dependence.

Figure 4 shows the highest Lomb powers obtained by this method for 2-year
intervals starting from August 1st to September 5th, 2000. The highest
peak in our dataset is observed for the critical date August 22nd, 2000.
This is in reasonable agreement with the critical date found by Sornette
and Zhou (August 9th, 2000).

To estimate the significance of this peak we calculated the 203 Lomb
periodograms of 2-year intervals starting at the 22nd of each month from
January 1984 to November 2000. Figure 5 shows a comparison of these
periodograms with the one from August 22nd, 2000. Notice that the
distribution of Lomb powers depends strongly on the frequency. For small
frequencies high Lomb powers are much more probable than at high
frequencies. The absence of large powers for very small frequencies is due
to the detrending procedure. These facts have also been observed by
[HJL+00] in a somewhat different setting.

Notice also that the peak at $f \approx 1.6$ is nevertheless quite large;
in fact it is the largest one observed at this frequency. But how probable
is it that we have a peak of this relative height for any of the tested
frequencies? The S&P 500 data set is not large enough to estimate this
probability accurately, but a count shows that there are 39 datasets that
have at least one frequency with a peak that is higher than any of the
others at the same frequency. This gives a naive estimate of
$39/202 \approx 19.3\%$.

The cumulative Lomb periodogram proposed above improves and greatly
simplifies this analysis. Figure 6 shows the cumulative Lomb periodograms
of the S&P 500 anti-bubble together with the cumulative Lomb periodograms
of all other datasets. A detrending of the data as above was not
necessary. Notice that there are now several peaks of comparable size at
frequencies 1.7, 3.4, 7.4 and 8.4. The fact that these lie close to the
harmonics of 1.7 compares well to the results of [ZS03].

We have included the theoretical 99.9%, 99% and 95%-quantiles for each 
frequency derived above together with the 95%-quantiles of the actual S&P 
500 data. Notice the excellent agreement for frequencies greater than 1. 
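Because the tail probability is $\exp(-z)$, the theoretical quantiles have
a closed form. The following lines are a worked example; the function name
`exponential_quantile` is chosen here for illustration.

```python
import math

def exponential_quantile(q):
    """Power z with Probability(P <= z) = q when Probability(P > z) = exp(-z)."""
    return -math.log(1.0 - q)

quantiles = {q: exponential_quantile(q) for q in (0.95, 0.99, 0.999)}
for q, z in quantiles.items():
    print(q, z)

# Chance of exceeding the power 5.61 at one fixed, pre-selected frequency.
single_frequency_tail = math.exp(-5.61)
print(single_frequency_tail)
```

The quantiles come out near 3.0, 4.6 and 6.9, the same at every frequency.
The single-frequency tail applies to one fixed frequency only and is
therefore much smaller than a significance that accounts for searching
over many frequencies.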

Since the height of the peaks is now largely independent of the
frequency, we can estimate the global significance of the peak at 1.7 with
cumulative Lomb power 5.61 by counting the number of cumulative Lomb
periodograms with at least one peak of higher power for frequencies
$0 < f < 10$. We find 12 of those, which implies a significance of
approximately $12/203 \approx 5.9\%$. In Figure 7 we compare the global
significance of peaks in cumulative Lomb periodograms of the S&P 500 with
those obtained from a Monte Carlo simulation of Brownian motion with zero
mean and unit variance. There is a reasonable agreement for significances
smaller than 0.1 even though Brownian motion is only a very rough model of
stock market prices.
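The counting estimate amounts to a one-line computation. In the sketch
below the function name `global_significance` is hypothetical and the
array of maxima is placeholder data constructed to reproduce the count of
12 out of 203; it is not the S&P 500 data.

```python
import numpy as np

def global_significance(peak_power, max_powers):
    """Fraction of reference periodograms whose highest peak exceeds peak_power."""
    max_powers = np.asarray(max_powers, dtype=float)
    return np.count_nonzero(max_powers > peak_power) / max_powers.size

# Placeholder maxima (NOT real data): 12 values above 5.61 and 191 below.
placeholder_maxima = np.concatenate([np.full(12, 6.0), np.full(191, 3.0)])
significance = global_significance(5.61, placeholder_maxima)
print(significance)   # equals 12/203
```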

Notice that the global significance of a peak depends strongly on the
number of frequencies considered. If we consider all frequencies up to the
Nyquist frequency for the average sampling interval

$$f_N = \frac{N}{2T} = \frac{500}{2\log(500)} \approx 40.2,$$

a peak of height 5.61 as above is found in 27.6% of simulated
periodograms and in 28.3% of the historical S&P 500 periodograms. In this
light the choice of $0 < f < 10$ by Sornette and Zhou seems to call for an
a priori justification if one wants to keep up the claim of a significant
log-periodic signature around $f = 1.7$.
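The Nyquist frequency quoted here follows from simple arithmetic. The
sketch assumes the natural logarithm, 500 samples, and a total time span
of log(500); the variable names are chosen for illustration.

```python
import math

n_samples = 500
total_span = math.log(500)      # T, from log(1) = 0 to log(500)
f_nyquist = n_samples / (2 * total_span)
print(f_nyquist)
```

This evaluates to roughly 40, about four times the upper limit of the
range $0 < f < 10$.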



5. Conclusion

We have introduced a new version of the Lomb periodogram that exhibits
good statistical properties when applied to cumulative noise. With this we
were able to detect the log-periodic signature in the S&P 500 anti-bubble
with better significance than with the usual periodogram, even without a
detrending procedure. More importantly this method allows us to estimate
the significance of the log-periodic signature found, which is a
reasonable 5.9% if we only consider frequencies smaller than 10.0, and a
disappointing 27.6% if one considers all frequencies up to the Nyquist
frequency for the average sampling interval. We also detect cumulative
Lomb peaks of similar significance at harmonics of the fundamental
frequency 1.7.

Figure 1. Scargle's periodogram for 1000 random walks
with normally distributed innovations of zero mean and unit
variance. Each random walk has 500 steps which are assumed
to occur at times log(1) . . . log(500).



HANS-CHRISTIAN GRAF V. BOTHMER

[Figure: "Cumulative Lomb Periodograms for Cumulative Noise", calculated
for 1000 random walks with 500 steps in logarithmic time. Plotted:
cumulative Lomb powers against frequency (f).]



Figure 2. Cumulative Lomb periodogram for 1000 random
walks with innovations of zero mean and unit variance. Each
random walk has 500 steps which are assumed to occur at
times log(1) . . . log(500). For the normalisation $\sigma_0 = 1$ has
been used.

Figure 3. Cumulative Lomb periodograms without the correction
for correlation between $C(\omega)$ and $S(\omega)$ of 1000 random
walks with innovations of zero mean and unit variance.
Each of the random walks has 500 steps which are assumed to
occur at times log(1) . . . log(500). $\sigma_0^2$ was estimated for each
dataset. The omission of the correction terms introduces
spurious peaks at several frequencies.

Figure 4. Highest peaks in Scargle's periodograms of 2-year
windows of S&P 500 data starting from different days in
August and September 2000. Before calculating the periodogram
the price data has been detrended according to the
procedure described in the text.



SIGNIFICANCE OF LOG-PERIODIC SIGNATURES IN CUMULATIVE NOISE

[Figure: "S&P 500 - Lomb Periodograms", 2-year windows starting at the
22nd of each month from Jan. 1984 to Nov. 2000. Horizontal axis:
Frequency (f).]



Figure 5. 202 Scargle periodograms for 2-year windows of
S&P 500 data starting from the 22nd of each month. Before
calculating the periodogram the price data has been detrended
according to the procedure described in the text. For one
window the detrending procedure did not converge.

[Figure: "Cumulative Lomb Periodograms of the S&P 500", 2-year windows
starting on the 22nd of each month from 1984/01 to 2000/11.]

Figure 6. 203 cumulative periodograms for 2-year windows
of S&P 500 data starting from the 22nd of each month. The
price data has not been detrended. $\sigma_0^2 = 0.000119403$ has
been estimated from the total 18 years of data.

Figure 7. Highest peaks of cumulative periodograms for
1000 random walks compared with those of 203 cumulative
periodograms for 2-year windows of S&P 500 data starting
from the 22nd of each month. The price data has not been
detrended. $\sigma_0^2 = 0.000119403$ has been estimated from the total
18 years of data.